An algebraic analysis of the two state Markov model on tripod trees

Steffen Klaere^{1,*}, Volkmar Liebscher^{2}

^{1} Department of Mathematics and Statistics, University of Otago, Dunedin, New Zealand
^{2} Institut für Mathematik und Informatik, Universität Greifswald, Germany



Abstract 

Methods of phylogenetic inference use more and more complex models to generate trees 
from data. However, even simple models and their implications are not fully understood. 

Here, we investigate the two-state Markov model on a tripod tree, inferring 
conditions under which a given set of observations gives rise to such a model. This type 
of investigation has been undertaken before by several scientists from different fields of 
research. 

In contrast to other work we fully analyse the model, presenting conditions under 
which one can infer a model from the observation or at least get support for the 
tree-shaped interdependence of the leaves considered. 

We also present all conditions under which the results can be extended from 
tripod trees to quartet trees, a step necessary to reconstruct at least a topology. Apart 
from finding conditions under which such an extension works we discuss example cases 
for which such an extension does not work. 

Keywords: Phylogenetics, Identifiability, Invariant, Two-State-Model



1. Introduction 

In phylogeny, one assumes that the relationship of a set of taxonomic units (or 
taxa) can be visualised by a (binary) tree. The aim is to derive this tree from the 
observations at the taxa. From a stochastic modelling point of view, one assigns the
taxa to the leaves of a (binary) tree, and assumes that the observations (which are 
usually considered to be i.i.d. over different sites) are the end results of a Markov 
process along the tree. The goal is to derive the best combination of tree and Markov 
model to explain the observations. 

This work regards the identifiability problem of this inference. It essentially asks
whether infinite data sets are able to uniquely identify the transitions on the tree
and the tree itself. Note that in the present context, identifiability readily leads
to consistency of various methods of estimating the parameters of the model
[see [?], Section 2.2, for an overview].

However, usually one only has an estimate of the leaf distribution such a process
induces. This leads to the question of whether one can find (simple) conditions to
determine whether a taxon distribution comes from a Markov process. In other words,
we ask whether we can validate the model, at least if there are infinitely many data
points available.

To approach this problem, we consider a very simple model. We assume that our
process can take only one of two states for every site, and that the tree is a tripod tree.
Under these restrictions, we can completely describe the map from the taxon
distribution to the parameters of the model, including necessary and sufficient
conditions on positivity of the parameters. No conditions for reversibility of
the processes on the edges are needed. The analysis of the model on tripod trees has
immediate consequences for quartet trees. We derive these conditions to exemplify the
shortcomings of an extension from tripods to quartets.

Technically, the generic part of this work is already well-known. Initial work on
the two state model from psychology can be found in Lazarsfeld and Henry [3]. Pearl
and Tarsi [4] used these results in artificial intelligence to algorithmically identify the
whole tree behind two-state Markov models. Note that identifiability of Markov models
especially in phylogeny was studied in Allman et al. [5], Allman and Rhodes
[6, 7], Baake [8], Chang [2]. We add to those results the analysis of the degenerate
cases, together with a complete analysis of the quartet tree model.

The typical tools (for multi-state models) to identify a subspace of taxon
distributions which might come from a Markovian tree model are phylogenetic
invariants [6, ?, ?, ?, 11, 12]. Those invariants are polynomials in the taxon distribution
which are zero for those distributions that are derived from the model of interest.
Sumner et al. [13] discuss another very interesting set of invariants, the so-called
Markov invariants. These are invariants whose value on a tree scales with the
determinants of the Markov matrices on the edges. Thus, Markov invariants indicate
simple relations between the observations (the distribution of leaf states) and the model
(described by the Markov matrices), and provide conditions on the observations based
on properties of the model. We will make use of this property in this work.

In the two-state tripod case there is only one invariant, the trivial one. However, not all
leaf distributions are derived from the Markov model. In fact, we derive polynomials
that vanish on distributions which satisfy the trivial invariant but are not identifiable
under the Markov model. To accommodate this observation we suggest incorporating
these polynomials into the set of invariants but with the addition that these
polynomials do not vanish for identifiable distributions. We discuss degenerate
distributions to describe this observation.

Although most of the leaf distributions allow for complex solutions of the model
equations, additional inequalities must be fulfilled for the solution of the algebraic
equations to be parameters of a Markov model [15, 13, ?]. The approach of
Matsen is restricted to the Cavender-Farris-Neyman model (CFN) [18, 19, 20] to
accommodate the Hadamard approach [21, 22]. Yang [17] investigated the CFN model
to explore conditions to obtain solutions for different optimisation problems in
phylogeny. Extending our approach we recover the inequalities presented in Pearl and
Tarsi [4].

As a final step we investigate how the results for tripod trees extend to trees of
four leaves. The results provide a glimpse at what we can expect from the
reconstruction from tripods when we have no knowledge of the identifiability of the
given taxon distribution.

The structure of this work is as follows: In Section 2 we describe the general
mutation model on a tree, with specialisation to tripod trees coming in Section 3.
Section 4 deals with the complete solution of the two-state tripod tree model. Then, in
Section 5 we use these results to analyse the general two-state Markov model on quartet
trees. Section 6 discusses the relation between our work and the concept of Markov
invariants, and possible extensions of this work. For the sake of readability, proofs are
presented in Appendix A.



2. The Markov model of mutation along a tree 

In this section we introduce the general Markov model and its properties. Pearl
and Tarsi [4] nicely motivate this model in the following way. Assume one is given a set
$L$ of taxa and a set of observations from a Markov process $X : L \to \{0, 1\}$. From these
observations one deduces a correlation between the taxa. The assumption is that this
correlation can be explained by an underlying (binary) tree $T = (V, E)$ and an
extension $Y : V \to \{0, 1\}$ of $X$ such that for any pair of taxa there is an interior node
such that given the state at the interior node the two taxa are independent. See Fig.
A.1 for a depiction of this.

[Figure 1 about here.]



Let us look closer at the process $Y$. The independence of pairs of taxa given an
interior node on the path between them corresponds to the so-called directed local
Markov property [e.g., 23, Chapter 2]. For this property one has to identify a node
$\zeta \in V$ as the root of the tree and direct all edges away from $\zeta$. Thus, our tree becomes
a directed acyclic graph, and for every node $\beta \in V \setminus \{\zeta\}$ there is a parent node
$\alpha \in V$ (with respect to the root), such that $(\alpha, \beta) \in E$. Further, for each node
$\beta \in V$ one defines the set of its descendants as those nodes $\alpha$ for which the path from
the root to $\alpha$ passes through $\beta$. The non-descendants are then the nodes that are neither
descendants nor parents.

The directed local Markov property states that conditioned on the state of its
parent node the state of a node $\alpha \in V$ is independent of the states of its
non-descendants. With this property the joint distribution has the factorisation
property, i.e. for the joint state $x \in \{0, 1\}^{|V|}$ we get

$$\Pr[Y = x] = q^\zeta_{x_\zeta} \prod_{e = (\alpha, \beta) \in E} M^e_{x_\alpha x_\beta}. \tag{1}$$

Here, the marginal distribution $q^\zeta$ corresponds to the initialisation of the process, i.e.
$q^\zeta_z$ is the probability that the process attains state $z \in \{0, 1\}$ at the root. The transition
matrices $(M^e)_{e \in E}$ describe the way the process progresses along an edge. E.g., for an
edge $e = (\alpha, \beta) \in E$ the term $M^e_{ab}$ is the probability that the character $a$ at node
$\alpha$ is mutated into character $b$ at node $\beta$.

In summary, the joint probability distribution is given by the marginal
distribution $q^\zeta$ and the transition matrices $(M^e)_{e \in E}$, and thus such a Markov process is
completely characterised by these parameters. We will call $q^\zeta$ and $(M^e)_{e \in E}$ the process
parameters.

In general, the actual position of the root node $\zeta$ is not important for (1), i.e. $\zeta$
can be chosen arbitrarily from $V$, including a leaf [e.g., 17].

We only have partial knowledge on the realisations of the process $Y$ through the
process $X$ on the leaves. The joint distribution of $X$ can then be inferred from (1)
using the law of total probability. Let $x \in \{0, 1\}^{|L|}$ denote the joint state at the leaves.
Then

$$p_x = \sum_{y:\, y|_L = x} \Pr[Y = y] = \sum_{y:\, y|_L = x} q^\zeta_{y_\zeta} \prod_{e = (\alpha, \beta) \in E} M^e_{y_\alpha y_\beta}. \tag{2}$$

Note that under the assumption that $X$ comes from a reversible Markov process $Y$,
Chang [2] proved that all process parameters can be recovered from the distributions
of the restrictions of $X$ to arbitrary triples of taxa.

If we find process parameters for a joint taxon distribution $p$ then we call $p$ tree
decomposable. If the obtained process parameters are unique (up to model-specific
symmetries), we call $p$ algebraically identifiable, and if further the process parameters
are marginal and transition probabilities, then $p$ is called stochastically identifiable.
Clearly, any stochastically identifiable distribution is also algebraically identifiable.

Looking at (2) we realise that verifying the tree decomposability of a distribution
$p$ is equivalent to solving a polynomial equation system of $2^{|L|} - 1$ independent
equations in $4|L| - 5$ variables. We observe that the Markov equations are
overdetermined for $|L| > 3$, i.e. the space of tree decomposable distributions is a proper
subspace of the space of all distributions. From this we conclude that there are
conditions that define a tree decomposable distribution. These conditions are generally
known as invariants, polynomials in $2^{|L|} - 1$ variables whose roots are distributions that
are tree decomposable. One example of an invariant is

$$\sum_{x \in \{0,1\}^{|L|}} p_x = 1, \tag{3}$$

i.e. all probabilities sum to one. This is fittingly called the trivial invariant. Allman
and Rhodes provide a complete set of invariants for trees of arbitrary size under a
two-state model, and observe that for complete identification knowledge of the
restrictions to six taxa is necessary.
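
The counting argument above can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and the parameter count assumes an unrooted binary tree, which has $2|L| - 3$ edges with two free entries each, plus one free root probability.

```python
def n_equations(n_leaves):
    """Independent equations: one per leaf pattern, minus the trivial invariant."""
    return 2 ** n_leaves - 1


def n_parameters(n_leaves):
    """Free parameters: two per edge of a binary tree plus one root probability."""
    n_edges = 2 * n_leaves - 3
    return 2 * n_edges + 1  # equals 4 * n_leaves - 5


for n in range(3, 7):
    print(n, n_equations(n), n_parameters(n))
```

For three leaves both counts agree, while from four leaves on the number of equations exceeds the number of parameters.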

However, as pointed out in multiple publications [e.g., ?, 13], such invariants are
not sufficient to guarantee tree identifiability. In particular, additional inequalities are
needed.

Here, we are not only interested in recapturing invariants and inequalities. In
addition, we also investigate those distributions that are not algebraically identifiable or
not tree decomposable at all to discuss their impact on invariant-based inference.

3. General properties of a Markov model on a tripod tree

The starting point of our analysis is the tripod tree $T$ with taxa $\alpha$, $\beta$, $\gamma$, interior
node $\zeta$ and edges $(\zeta, \alpha)$, $(\zeta, \beta)$, $(\zeta, \gamma)$ (see Fig. A.2). This is the only labeled topology
for three taxa. Hence any inference will be process- and not topology-related. Allman
and Rhodes [7] select a taxon as the root for their approach. We will place the root at
the interior node for the symmetry this provides in the tree equations.

[Figure 2 about here.]

As stated in the previous section, if the joint distribution $p$ of $X_\alpha$, $X_\beta$, $X_\gamma$ comes
from a Markov process then there are parameters $q^\zeta$, $M^\alpha$, $M^\beta$, $M^\gamma$ such that the
Markov equations (2) are satisfied. On a tripod tree these equations are the tripod
equations

$$p_{abc} = q^\zeta_1 M^\alpha_{1a} M^\beta_{1b} M^\gamma_{1c} + (1 - q^\zeta_1) M^\alpha_{0a} M^\beta_{0b} M^\gamma_{0c}, \qquad a, b, c \in \{0, 1\}. \tag{4}$$

As before we call $p$ tree decomposable if such parameters exist, algebraically identifiable
when the parameters are unique (up to some symmetries discussed later), and
stochastically identifiable if the parameters are unique and are proper marginal and
transition probabilities.
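
As a minimal sketch of the tripod equations (4), the following Python function maps process parameters to the leaf distribution. The function name and the parameter values are hypothetical and chosen only for illustration; each matrix is stored with rows indexed by the root state.

```python
from itertools import product


def tripod_distribution(q1, M_alpha, M_beta, M_gamma):
    """Leaf distribution p[(a, b, c)] from the tripod equations (4).

    q1 is the root probability of state 1; M[z][x] is the probability of
    observing state x at the leaf given root state z.
    """
    q = (1.0 - q1, q1)
    p = {}
    for a, b, c in product((0, 1), repeat=3):
        p[(a, b, c)] = sum(
            q[z] * M_alpha[z][a] * M_beta[z][b] * M_gamma[z][c] for z in (0, 1)
        )
    return p


# Hypothetical parameter values, for illustration only.
M = [[0.8, 0.2], [0.2, 0.8]]
p = tripod_distribution(0.5, M, M, M)
print(p[(0, 0, 0)], sum(p.values()))
```

By construction the eight probabilities sum to one, i.e. the trivial invariant holds for every choice of proper parameters.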

Lazarsfeld and Henry [3] and Pearl and Tarsi [4] were mainly
interested in inferring conditions under which a triplet distribution is stochastically
identifiable. While recovering their results we also investigate tree decomposability and
algebraic identifiability in order to describe their impact on invariant-based inference.

For three taxa the only invariant is the trivial invariant. Thus, one could expect
that all triplet distributions are tree decomposable. As we will see later, this is not the
case. In fact, we will present polynomials whose roots satisfy the trivial invariant but
are not tree decomposable.

3.1. Statistics for binary models

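Following Pearl and Tarsi [4] we identify the symbols 0 and 1 with their actual
integer counterparts. This permits us to introduce a set of terms that are very helpful
for later steps of the analysis. We start by introducing the following abbreviations:

$$\begin{aligned}
\varepsilon_{\alpha\beta\gamma} &:= E X_\alpha X_\beta X_\gamma = \Pr[X_\alpha = 1, X_\beta = 1, X_\gamma = 1] = p_{111}, \\
\varepsilon_{\alpha\beta} &:= E X_\alpha X_\beta = \Pr[X_\alpha = 1, X_\beta = 1] = p_{11\Sigma} = p_{110} + p_{111}, \\
\varepsilon_{\alpha} &:= E X_\alpha = \Pr[X_\alpha = 1] = p_{1\Sigma\Sigma} = p_{100} + p_{101} + p_{110} + p_{111}.
\end{aligned}$$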

The symbol $p_{11\Sigma}$ and its modifications $p_{1\Sigma 1}$ etc. are direct consequences of the
application of the law of total probability to the equation system (4). These terms are
also known as marginalisations: a random variable is removed from
consideration by summing over its states. Since this modification is linear, we can also
study the tripod equations (4) in terms of their marginalisations.

In the case of the binary model the above symbols $\varepsilon_A$, $A \subseteq L$, correspond to
the joint mean of the random variables for the taxa in $A$. Using these definitions we can
introduce simple terms which correspond to the covariances between the random
variables:

$$\tau_{\alpha\beta} := \mathrm{Cov}[X_\alpha, X_\beta] = E X_\alpha X_\beta - E X_\alpha \, E X_\beta,$$

with equivalent definitions for $\tau_{\alpha\gamma}$ and $\tau_{\beta\gamma}$. Of further interest are the following terms
($c \in \{0, 1\}$)

$$\tau_{\alpha\beta|c} := p_{11c}\, p_{\Sigma\Sigma c} - p_{1\Sigma c}\, p_{\Sigma 1 c},$$

with equivalent definitions for $\tau_{\alpha\gamma|b}$, $b \in \{0, 1\}$, and $\tau_{\beta\gamma|a}$, $a \in \{0, 1\}$. These terms are
actually multiples of the conditional covariances,
$\mathrm{Cov}[X_\alpha, X_\beta \mid X_\gamma = c] = \tau_{\alpha\beta|c} / p_{\Sigma\Sigma c}^2$.

Finally, we also introduce the three-way covariances

$$\begin{aligned}
\tau_{\alpha\beta\gamma} &:= \mathrm{Cov}[X_\alpha, X_\beta, X_\gamma] = E (X_\alpha - E X_\alpha)(X_\beta - E X_\beta)(X_\gamma - E X_\gamma) \\
&= \varepsilon_{\alpha\beta\gamma} - \varepsilon_\alpha \varepsilon_{\beta\gamma} - \varepsilon_\beta \varepsilon_{\alpha\gamma} - \varepsilon_\gamma \varepsilon_{\alpha\beta} + 2 \varepsilon_\alpha \varepsilon_\beta \varepsilon_\gamma.
\end{aligned}$$

For a review on covariance for more than two random variables see e.g. Rayner and Beh
[24]. The term $\tau_{\alpha\beta\gamma}$ describes the interactions of the three leaves considered. Sumner
et al. [13] call this term a stangle, a stochastic tangle, highlighting its relation to
entangled states of qubits in quantum mechanics. The three-way covariances are zero in
the case of symmetric models like CFN, which also reflects the findings in Baake [8].
However, for more complex models the three-way covariances are needed as indicated
by the findings of Chang [2].
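
The statistics of this section are straightforward to compute from a triplet distribution. The sketch below uses hypothetical function names and illustrative counts out of 27; leaves $\alpha$, $\beta$, $\gamma$ are indexed 0, 1, 2, patterns are ordered as $(p_{000}, p_{001}, \ldots, p_{111})$, and exact rational arithmetic is used so that zero covariances can be detected reliably.

```python
from fractions import Fraction
from itertools import product


def moments(p):
    """Means, pairwise covariances and three-way covariance of p[(a, b, c)]."""
    def mean(idx):
        # Joint mean: probability that all leaves in idx show state 1.
        return sum(pr for x, pr in p.items() if all(x[i] == 1 for i in idx))

    e = [mean((i,)) for i in range(3)]
    e2 = {ij: mean(ij) for ij in ((0, 1), (0, 2), (1, 2))}
    tau = {ij: e2[ij] - e[ij[0]] * e[ij[1]] for ij in e2}
    tau3 = (mean((0, 1, 2)) - e[0] * e2[(1, 2)] - e[1] * e2[(0, 2)]
            - e[2] * e2[(0, 1)] + 2 * e[0] * e[1] * e[2])
    return e, tau, tau3


def conditional_tau(p, pair, c):
    """tau_{pair|c}: multiple of the covariance of pair given the third leaf is c."""
    i, j = pair
    k = 3 - i - j

    def pat(xi, xj):
        x = [0, 0, 0]
        x[i], x[j], x[k] = xi, xj, c
        return p[tuple(x)]

    return pat(1, 1) * pat(0, 0) - pat(1, 0) * pat(0, 1)


patterns = list(product((0, 1), repeat=3))
weights = (6, 6, 2, 2, 1, 1, 4, 5)  # illustrative counts out of 27
p = {x: Fraction(w, 27) for x, w in zip(patterns, weights)}
e, tau, tau3 = moments(p)
print(tau, tau3)
```

The conditional term is computed in the reduced form $p_{11c}p_{00c} - p_{10c}p_{01c}$, which equals the definition above after expanding the marginal sums.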

Since covariances are a measure of interdependence of random variables, and
because the identification of a tree and a Markov model is an interpretation of the
interdependence in terms of hidden variables and conditional independence, looking at
these covariances is a very logical way to verify whether or not such an interpretation is
admissible. Using these terms we can immediately propose a useful property.

Lemma 1. Let $p$ denote the joint probability for binary random variables $X_\alpha$, $X_\beta$ and
$X_\gamma$. If we flip the state in one taxon, then we flip the signs in its pairwise covariances.
E.g., if $X_\alpha \mapsto 1 - X_\alpha$, then $\tau_{\alpha\beta} \mapsto -\tau_{\alpha\beta}$, $\tau_{\alpha\gamma} \mapsto -\tau_{\alpha\gamma}$, $\tau_{\beta\gamma} \mapsto \tau_{\beta\gamma}$.

One immediate consequence of this observation is that the product $\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}$
always has the same sign no matter how much we flip states.
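
Lemma 1 can be illustrated numerically. The following sketch, with hypothetical helper names and illustrative counts out of 27, flips the state of the first leaf and prints the pairwise covariances before and after the flip.

```python
from fractions import Fraction
from itertools import product


def pair_cov(p, i, j):
    """Cov[X_i, X_j] for a distribution p over patterns in {0,1}^3."""
    ei = sum(pr for x, pr in p.items() if x[i] == 1)
    ej = sum(pr for x, pr in p.items() if x[j] == 1)
    eij = sum(pr for x, pr in p.items() if x[i] == 1 and x[j] == 1)
    return eij - ei * ej


def flip(p, i):
    """Relabel the states of leaf i, i.e. X_i -> 1 - X_i."""
    return {x[:i] + (1 - x[i],) + x[i + 1:]: pr for x, pr in p.items()}


patterns = list(product((0, 1), repeat=3))
p = {x: Fraction(w, 27) for x, w in zip(patterns, (6, 7, 2, 1, 1, 1, 4, 5))}
p_flipped = flip(p, 0)
for i, j in ((0, 1), (0, 2), (1, 2)):
    print((i, j), pair_cov(p, i, j), pair_cov(p_flipped, i, j))
```

Two of the three covariances change sign and the third is unchanged, so the product of the three keeps its sign.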

3.2. Tree properties

In this section we assume that $p$ is tree decomposable and regard some immediate
consequences. We will later see that these conditions are necessary for identifiability
but not sufficient. Nevertheless, these conditions provide some immediate insights.

Lemma 2. 1. If a triplet distribution $p$ is tree decomposable on $T$ with $\tau_{\alpha\beta} = 0$, then
also $\tau_{\alpha\beta\gamma} = 0$ and $\tau_{\alpha\gamma} = 0$ or $\tau_{\beta\gamma} = 0$.

2. If a triplet distribution $p$ is stochastically identifiable then the product
$\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}$ is non-negative.

The non-negativity of the product has already been verified by Lazarsfeld and
Henry [3]. With Lemma 1 it is not complicated to derive that on a star tree (with
arbitrary number of leaves) there always is a state flipping such that all pairs of leaves
are positively correlated.

Corollary 3. Suppose we are given a stochastically identifiable distribution $p$ on a tree
with finite leaf set $L$ such that the pairwise covariances do not vanish, i.e. $\tau_{\alpha\beta} \neq 0$ for
all $\alpha, \beta \in L$. Then there exists a set of leaves $L_0 \subseteq L$ such that flipping the states of the
leaves in $L_0$ yields all covariances $\tau_{\alpha\beta}$, $\alpha, \beta \in L$, being positive.

The situation of Lemma 2(1) occurs exactly if $X_\alpha$ or $X_\beta$ is independent of the remaining
random variables. It also implies the following:

Corollary 4. A triplet distribution $p$ with $\tau_{\alpha\beta} = 0$ but $\tau_{\alpha\gamma} \neq 0$ and $\tau_{\beta\gamma} \neq 0$ is not tree
decomposable.

Thus we already see that the trivial invariant does not characterise tree
decomposable distributions in this setting. The following example shows that such cases
can be easily constructed.

Example 1. Triplet distributions of type

$$\begin{aligned}
p &= (p_{000}, p_{001}, p_{010}, p_{011}, p_{100}, p_{101}, p_{110}, p_{111}) \\
&= (4 - x, x, 2, 2, 2, 2, 2, 2)/16, \qquad x \in [0, 4] \setminus \{2\},
\end{aligned}$$

yield $\tau_{\alpha\beta} = 0$ but $\tau_{\alpha\gamma} = \tau_{\beta\gamma} = (2 - x)/32$ and hence are not tree decomposable. In fact,
for binary variables a much more complicated graphical model with more "inner" nodes
and edges is needed to explain these covariances.

4. Solving the tripod equations

In this section we are given a triplet distribution $p$ and infer conditions under
which it is algebraically identifiable. For each case we will present an example.

4.1. The algebraic solution

As has been pointed out multiple times, the only invariant in the tripod case is
the trivial invariant. In other words, the "set" of invariants for a tripod tree is satisfied
by all triplet distributions. However, as we have seen in Corollary 4 there are triplet
distributions that are not tree decomposable even though they satisfy the trivial
invariant. Thus executing the actual decomposition, i.e. finding a solution for the
tripod equations, not only provides complete forms for the parameters but is also helpful
to identify further cases. The first task is to clarify up to which level of uniqueness the
decomposition of a triplet distribution can be attained. To do this we look at the
implications of a state flip at the root.

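Lemma 5. If a triplet distribution $p$ is tree decomposable with parameters
$q^\zeta$, $M^\alpha$, $M^\beta$, $M^\gamma$ then it is also tree decomposable for parameters
$\tilde q^\zeta$, $\tilde M^\alpha$, $\tilde M^\beta$, $\tilde M^\gamma$ with

$$\tilde q^\zeta_z = q^\zeta_{1-z}, \quad \tilde M^\alpha_{za} = M^\alpha_{(1-z)a}, \quad \tilde M^\beta_{zb} = M^\beta_{(1-z)b}, \quad \tilde M^\gamma_{zc} = M^\gamma_{(1-z)c}.$$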

Hence, except for the case where everything is equal to 1/2, there will always be
at least two sets of parameters that decompose a triplet distribution $p$. In terms of
molecular evolution one can view these solutions as having few mutations
($M^\delta_{z(1-z)} < M^\delta_{zz}$, $\delta$ leaf) for one and many mutations ($M^\delta_{z(1-z)} > M^\delta_{zz}$, $\delta$ leaf) for
the other. Chang [2] addressed the problem of symmetric solutions by introducing matrix
categories that are reconstructible from rows. One such class consists of diagonally
dominant matrices, i.e. $M^\delta_{zz} > M^\delta_{z(1-z)}$ for all leaves $\delta$ and $z \in \{0, 1\}$. If only these two
sets of parameters exist then we will regard the associated distribution as algebraically
identifiable. It should be noted that the set of symmetric solutions increases with the
number of parameters, i.e. each possible permutation of the states at the root yields a
new solution. This fact has also been observed by Chang [2] in the case of the
time-continuous Markov model.
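
The symmetry of Lemma 5 can be sketched as follows. The helper names and the parameter values are hypothetical; the point is only that exchanging the two root states, together with the corresponding rows of all transition matrices, leaves the leaf distribution unchanged.

```python
from itertools import product


def tripod_distribution(q1, Ms):
    """Leaf distribution from the tripod equations (4); Ms[i][z][x] for leaf i."""
    q = (1.0 - q1, q1)
    return {
        x: sum(q[z] * Ms[0][z][x[0]] * Ms[1][z][x[1]] * Ms[2][z][x[2]] for z in (0, 1))
        for x in product((0, 1), repeat=3)
    }


def flip_root(q1, Ms):
    """Swap the two root states: q_z -> q_{1-z} and row z -> row 1-z in each matrix."""
    return 1.0 - q1, [[M[1], M[0]] for M in Ms]


# Hypothetical parameters, for illustration only.
q1 = 0.3
Ms = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.7, 0.3], [0.4, 0.6]],
    [[0.85, 0.15], [0.1, 0.9]],
]
p = tripod_distribution(q1, Ms)
p_flipped = tripod_distribution(*flip_root(q1, Ms))
print(max(abs(p[x] - p_flipped[x]) for x in p))
```

The printed difference is zero up to floating-point rounding, so the two parameter sets cannot be distinguished from the leaf distribution alone.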

Next, we present conditions under which $p$ is algebraically identifiable and present
the closed form for the parameters.

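Theorem 6. Let $p$ denote a triplet distribution and assume

$$\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} \neq 0 \quad \text{and} \quad \tau_{\alpha\beta\gamma}^2 + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} \neq 0. \tag{5}$$

Then $p$ is algebraically identifiable. The associated parameters have the following form:

$$\begin{aligned}
q^\zeta_1 &= \frac{1}{2} - \frac{\tau_{\alpha\beta\gamma}}{2\sqrt{\chi}}, \\
M^\alpha_{01} &= \varepsilon_\alpha + \frac{\tau_{\alpha\beta\gamma} - \sqrt{\chi}}{2\tau_{\beta\gamma}}, &
M^\beta_{01} &= \varepsilon_\beta + \frac{\tau_{\alpha\beta\gamma} - \sqrt{\chi}}{2\tau_{\alpha\gamma}}, &
M^\gamma_{01} &= \varepsilon_\gamma + \frac{\tau_{\alpha\beta\gamma} - \sqrt{\chi}}{2\tau_{\alpha\beta}}, \\
M^\alpha_{11} &= \varepsilon_\alpha + \frac{\tau_{\alpha\beta\gamma} + \sqrt{\chi}}{2\tau_{\beta\gamma}}, &
M^\beta_{11} &= \varepsilon_\beta + \frac{\tau_{\alpha\beta\gamma} + \sqrt{\chi}}{2\tau_{\alpha\gamma}}, &
M^\gamma_{11} &= \varepsilon_\gamma + \frac{\tau_{\alpha\beta\gamma} + \sqrt{\chi}}{2\tau_{\alpha\beta}},
\end{aligned} \tag{6}$$

where $\chi = \tau_{\alpha\beta\gamma}^2 + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}$.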

Note that Pearl and Tarsi [4] presented a similar solution for the parameters.
Looking at the parameters in (6) we see that algebraically the conditions in (5) prevent
division by zero. Together with the trivial invariant we can thus claim that the space of
algebraically identifiable triplet distributions is given by $S \setminus (S_0 \cup S_1)$ with

$$\begin{aligned}
S &= \{p \in \mathbb{R}^8 : p_{000} + \cdots + p_{111} = 1\}, \\
S_0 &= \{p \in S : \tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} = 0\}, \\
S_1 &= \{p \in S : \tau_{\alpha\beta\gamma}^2 + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} = 0\}.
\end{aligned}$$



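Considering (5) and Lemma 2(2) we see that triplet distributions with
$\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} < 0$ are only algebraically, but not stochastically identifiable. In fact, for
$-\tau_{\alpha\beta\gamma}^2 < 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} < 0$ we get real-valued parameters, and for
$4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} < -\tau_{\alpha\beta\gamma}^2$ we get a set of complex-valued parameters.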
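
A numerical sketch of the closed form of Theorem 6 is given below. It rests on assumptions that should be read with care: the sign convention is the one in which $M_{01} = \varepsilon + (\tau_{\alpha\beta\gamma} - \sqrt{\chi})/(2\tau)$ and $M_{11} = \varepsilon + (\tau_{\alpha\beta\gamma} + \sqrt{\chi})/(2\tau)$, the function name and the tolerance are hypothetical, and a complex square root is used so that the case $\chi < 0$ returns complex-valued parameters instead of failing. The input is the distribution discussed in Example 3.

```python
import cmath
from itertools import product


def solve_tripod(p, tol=1e-12):
    """Parameters (q1, M01, M11) from the closed form (6).

    p maps patterns (a, b, c) to probabilities. M01[i] and M11[i] are the
    probabilities of state 1 at leaf i given root state 0 and 1.
    """
    def mean(idx):
        return sum(pr for x, pr in p.items() if all(x[i] == 1 for i in idx))

    e = [mean((i,)) for i in range(3)]
    opposite = [(1, 2), (0, 2), (0, 1)]  # pair of leaves not containing leaf i
    tau = [mean(ij) - e[ij[0]] * e[ij[1]] for ij in opposite]
    tau3 = (mean((0, 1, 2)) - sum(e[i] * mean(opposite[i]) for i in range(3))
            + 2 * e[0] * e[1] * e[2])
    chi = tau3 ** 2 + 4 * tau[0] * tau[1] * tau[2]
    # Conditions (5); the tolerance is an assumption for floating-point input.
    if abs(tau[0] * tau[1] * tau[2]) < tol or abs(chi) < tol:
        raise ValueError("conditions (5) are violated")
    root = cmath.sqrt(chi)  # complex-valued when chi < 0
    q1 = 0.5 - tau3 / (2 * root)
    M01 = [e[i] + (tau3 - root) / (2 * tau[i]) for i in range(3)]
    M11 = [e[i] + (tau3 + root) / (2 * tau[i]) for i in range(3)]
    return q1, M01, M11


# Distribution of Example 3.
weights = (68, 0, 20, 12, 20, 12, 17, 51)
p = {x: w / 200 for x, w in zip(product((0, 1), repeat=3), weights)}
q1, M01, M11 = solve_tripod(p)
print(q1, M01, M11)
```

The second solution of Lemma 5 is obtained by replacing `root` with `-root`, which exchanges `M01` and `M11` and maps `q1` to `1 - q1`.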

The following example presents such distributions.

Example 2. Regard the distributions

$$p_1 = (6, 7, 2, 1, 1, 1, 4, 5)/27, \quad p_2 = (6, 7, 1, 2, 1, 1, 4, 5)/27, \quad p_3 = (6, 6, 2, 2, 1, 1, 4, 5)/27.$$

All three distributions satisfy the conditions (5), i.e. they are algebraically identifiable.
For $p_1$ the covariance $\tau_{\beta\gamma}$ is negative and the other two positive, while for $p_2$ we have
$\tau_{\alpha\gamma}$ negative and the other two positive. The distribution $p_3$ has only positive pairwise
covariances.

The parameters for $p_1$ are real-valued, the parameters for $p_2$ are complex-valued
and $p_3$ is stochastically identifiable.

Though this example is artificial it indicates just how sensitive the model is to
misreads in alignments. E.g., the difference between $p_1$ and $p_3$ could be seen as reading
the pattern 011 under $p_3$ as pattern 001 under $p_1$.

4.2. Stochastically identifiable distributions

The next step is to determine conditions under which a distribution satisfying (5)
is stochastically identifiable. These conditions should correspond to the conditions
given by Pearl and Tarsi [4, Theorem 1].

Example 2 dealt with $\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} < 0$. However, as the following example shows,
positivity of the product does not necessarily yield stochastic identifiability.

Example 3. The tripod distribution

$$p = (68, 0, 20, 12, 20, 12, 17, 51)/200$$

yields positive covariances for all three pairs but also $M^\gamma_{01} = -1/20$, i.e. not a
probability.

The example contains a pattern of expected zero occurrence. From the tripod
equations we conclude that a stochastically identifiable distribution is strictly positive,
thus this example is slightly contrived. However, as Example 1 showed, a strictly
positive triplet distribution is not necessarily stochastically identifiable either.

In order to get necessary and sufficient conditions on a triplet distribution to be
stochastically identifiable we need to go back to the parameters in (6) and bound them
accordingly. This yields:

Theorem 7. A triplet distribution $p$ is stochastically identifiable if and only if after
suitable state flips the following inequalities hold

$$\begin{aligned}
\tau_{\alpha\beta} &> 0, & \tau_{\alpha\beta|0} &> 0, & \tau_{\alpha\beta|1} &> 0, \\
\tau_{\alpha\gamma} &> 0, & \tau_{\alpha\gamma|0} &> 0, & \tau_{\alpha\gamma|1} &> 0, \\
\tau_{\beta\gamma} &> 0, & \tau_{\beta\gamma|0} &> 0, & \tau_{\beta\gamma|1} &> 0.
\end{aligned} \tag{7}$$

In other words, the direction of the correlation between a pair of leaves shall not
be influenced by the third leaf. With this we can summarise that a triplet distribution
is stochastically identifiable if it is in $S \setminus (S_0 \cup S_1)$ and there is a state flip such that (7)
is satisfied.

Example 4. The tripod distribution $p$ from Example 3 has positive pairwise and
conditional covariances except for $\tau_{\alpha\beta|1} = -9/2500$. Thus it does not satisfy (7).
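
The criterion of Theorem 7 lends itself to a direct check. The sketch below uses hypothetical function names and takes the inequalities (7) as strict; it tries all eight combinations of state flips at the leaves and tests the nine inequalities for each. The input is the distribution of Example 3.

```python
from fractions import Fraction
from itertools import product

PAIRS = ((0, 1), (0, 2), (1, 2))


def flip(p, flips):
    """Apply X_i -> 1 - X_i to every leaf i with flips[i] == 1."""
    return {tuple(x[i] ^ flips[i] for i in range(3)): pr for x, pr in p.items()}


def satisfies_7(p):
    """Check the nine strict inequalities (7) for p without any state flip."""
    for i, j in PAIRS:
        k = 3 - i - j
        ei = sum(pr for x, pr in p.items() if x[i] == 1)
        ej = sum(pr for x, pr in p.items() if x[j] == 1)
        eij = sum(pr for x, pr in p.items() if x[i] == 1 and x[j] == 1)
        if eij - ei * ej <= 0:
            return False
        for c in (0, 1):
            def pat(xi, xj):
                x = [0, 0, 0]
                x[i], x[j], x[k] = xi, xj, c
                return p[tuple(x)]
            if pat(1, 1) * pat(0, 0) - pat(1, 0) * pat(0, 1) <= 0:
                return False
    return True


def stochastically_identifiable(p):
    """True if some combination of leaf state flips satisfies (7)."""
    return any(satisfies_7(flip(p, f)) for f in product((0, 1), repeat=3))


weights = (68, 0, 20, 12, 20, 12, 17, 51)  # Example 3
p_ex3 = {x: Fraction(w, 200) for x, w in zip(product((0, 1), repeat=3), weights)}
print(stochastically_identifiable(p_ex3))
```

Exact fractions are used because several of the quantities involved are zero or close to zero, where floating-point comparisons would be unreliable.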

4.3. Non-identifiable cases

The above considerations dealt with cases where a given triplet distribution $p$ is
algebraically identifiable. The final step of the tripod analysis is to regard those
distributions that violate the conditions (5). Corollary 4 already discussed the case
where one pairwise covariance is zero while the other two are not and we found that
such distributions are not tree decomposable. In the following we look at the remaining cases.



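Proposition 8. Assume that a triplet distribution $p$ obeys
$4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} = -\tau_{\alpha\beta\gamma}^2$ but $\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} \neq 0$. Then $p$ is not tree decomposable.

In other words, we found another set of triplet distributions that are not tree
decomposable.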

Example 5. The distribution

$$p = (16, 5, 8, 15, 14, 5, 2, 15)/80$$

yields $\tau_{\alpha\beta} = -1/80$, $\tau_{\alpha\gamma} = 1/40$ and $\tau_{\beta\gamma} = 1/8$ but $\chi = 0$ and hence has no factorisation
in the sense of (4). As in Example 1 we point out here that there seems to be no simple
graphical structure which explains the observed covariances adequately. On the other
hand, similarly to Example 2 the simple act of moving 1/80 from pattern 100 to pattern
110 yields algebraic identifiability. This indicates the level of care required when
inferring meaning from observed covariances.

Together with Corollary 4 this covers the distributions that are not tree
decomposable. The remaining cases are triplet distributions that are tree decomposable
but not algebraically identifiable.
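
The numbers of Example 5 can be reproduced in exact arithmetic. The following sketch, with a hypothetical helper name, computes the pairwise covariances and $\chi$ for the distribution of Example 5 and for the modified distribution in which 1/80 is moved from pattern 100 to pattern 110.

```python
from fractions import Fraction
from itertools import product


def taus_and_chi(weights, total):
    """Return (tau_ab, tau_ac, tau_bc, chi) for counts ordered p000, ..., p111."""
    p = {x: Fraction(w, total) for x, w in zip(product((0, 1), repeat=3), weights)}

    def mean(*idx):
        return sum(pr for x, pr in p.items() if all(x[i] == 1 for i in idx))

    ea, eb, ec = mean(0), mean(1), mean(2)
    t_ab = mean(0, 1) - ea * eb
    t_ac = mean(0, 2) - ea * ec
    t_bc = mean(1, 2) - eb * ec
    tau3 = (mean(0, 1, 2) - ea * mean(1, 2) - eb * mean(0, 2) - ec * mean(0, 1)
            + 2 * ea * eb * ec)
    chi = tau3 ** 2 + 4 * t_ab * t_ac * t_bc
    return t_ab, t_ac, t_bc, chi


print(taus_and_chi((16, 5, 8, 15, 14, 5, 2, 15), 80))  # Example 5
print(taus_and_chi((16, 5, 8, 15, 13, 5, 3, 15), 80))  # 1/80 moved from 100 to 110
```

With floating-point input the test $\chi = 0$ would require a tolerance, which is why rational numbers are used here.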

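Proposition 9. Let $p$ be a triplet distribution with $\tau_{\alpha\beta} = 0$ and $\tau_{\alpha\gamma} = 0$. Then $p$ is tree
decomposable with infinitely many parameter sets.

The parameter sets are identified by one of the following compositions:

(i) $\tau_{\beta\gamma} \neq 0$. Then $M^\alpha_{0a} = M^\alpha_{1a} = p_{a\Sigma\Sigma}$, $a \in \{0, 1\}$, and for any $a, b, c \in \{0, 1\}$:

[Equation (8) about here.]

with free parameters $M^\gamma_{0c} \neq M^\gamma_{1c}$.

(ii) $\tau_{\beta\gamma} = 0$. Then for all $a, b, c \in \{0, 1\}$ the free parameters can be distributed as
follows:

(a) $M^\alpha_{0a} = M^\alpha_{1a} = p_{a\Sigma\Sigma}$, $M^\beta_{0b} = M^\beta_{1b} = p_{\Sigma b\Sigma}$ and

[Equation (9) about here.]

with free parameters $M^\gamma_{0c} \neq M^\gamma_{1c}$.

(b) $M^\alpha_{0a} = M^\alpha_{1a} = p_{a\Sigma\Sigma}$, $M^\beta_{0b} = M^\beta_{1b} = p_{\Sigma b\Sigma}$, $M^\gamma_{0c} = M^\gamma_{1c} = p_{\Sigma\Sigma c}$ with free
parameter $q^\zeta_1$.

(c) $q^\zeta_1 = 0$, $M^\alpha_{0a} = p_{a\Sigma\Sigma}$, $M^\beta_{0b} = p_{\Sigma b\Sigma}$, $M^\gamma_{0c} = p_{\Sigma\Sigma c}$ with free parameters
$M^\alpha_{1a}$, $M^\beta_{1b}$, $M^\gamma_{1c}$.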

In other words, the distribution is tree decomposable because process parameters
exist but it is not algebraically identifiable because we have no means to recover the
true parameters or, more precisely, there are infinitely many parameter sets that yield the
same distribution.

Example 6. The triplet distribution

$$p = (2, 2, 2, 2, 2, 2, 2, 2)/16$$

yields complete independence of the leaves, $\tau_{\alpha\beta} = \tau_{\alpha\gamma} = \tau_{\beta\gamma} = 0$, i.e. the case (ii) in
Proposition 9 is to be regarded here. It is not too surprising that such a distribution
yields an infinite number of solutions since the state at the root is completely
undetermined.
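
The non-identifiability in Example 6 can be made concrete with a short sketch. The function name is hypothetical; the parameter choice follows the reading of case (ii)(b) in which every transition matrix row equals the leaf marginal and the root probability is free.

```python
from itertools import product


def tripod_distribution(q1, Ms):
    """Leaf distribution from the tripod equations (4); Ms[i][z][x] for leaf i."""
    q = (1.0 - q1, q1)
    return {
        x: sum(q[z] * Ms[0][z][x[0]] * Ms[1][z][x[1]] * Ms[2][z][x[2]] for z in (0, 1))
        for x in product((0, 1), repeat=3)
    }


# Both rows equal the leaf marginal (1/2, 1/2), so the leaf ignores the root state.
uniform_rows = [[0.5, 0.5], [0.5, 0.5]]
for q1 in (0.1, 0.5, 0.9):
    p = tripod_distribution(q1, [uniform_rows] * 3)
    print(q1, sorted(set(round(v, 12) for v in p.values())))
```

Every value of the root probability produces the same uniform distribution, so the root distribution cannot be recovered from the leaves.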

Looking again at the cases listed above, we see that $X_\alpha$ is not only pairwise
independent from $(X_\beta, X_\gamma)$ (induced by $\tau_{\alpha\beta} = \tau_{\alpha\gamma} = 0$), but even completely
independent. Then the multiple solutions come from the fact that we can place the root
arbitrarily between $\beta$ and $\gamma$.

The good news is that the non-identifiable cases form a small subset among all
triplet distributions. In fact:

Proposition 10. Non-identifiable triplet distributions, i.e. distributions violating the
conditions (5), form a Lebesgue zero set in the set of all possible triplet distributions.

This concludes our analysis of the tripod case. We identified the subset of triplet
distributions that are uniquely algebraically and stochastically identifiable, and those
that are tree decomposable but not algebraically identifiable, or not tree decomposable
at all.



5. Extension to quartet trees

In this section we will explore the implications of extending the results for three
taxa to four taxa. For this section we look at the quartet tree $Q = (V, E)$ with

$$V = \{\zeta, \psi, \alpha, \beta, \gamma, \delta\}, \qquad E = \{(\zeta, \psi), (\zeta, \alpha), (\zeta, \beta), (\psi, \gamma), (\psi, \delta)\}.$$

Fig. A.3 provides an illustration including the four tripod restrictions
$T_{\alpha\beta\gamma}$, $T_{\alpha\beta\delta}$, $T_{\alpha\gamma\delta}$ and $T_{\beta\gamma\delta}$.

Regard the quartet distribution $\pi = (\pi_{abcd})_{a,b,c,d \in \{0,1\}}$ describing the joint
distribution for $\alpha$, $\beta$, $\gamma$ and $\delta$. If $\pi$ is stochastically identifiable and reversible then it
can be reconstructed from the marginalisations on its four tripods [2], i.e. computing
the parameters for all tripods will immediately return the full process. However, the
converse is not necessarily true. As Example 7 below shows, there are cases where each
tripod marginalisation is stochastically identifiable but no quartet tree can be
reconstructed.

[Figure 3 about here.]

Pearl and Tarsi [4] presented an algorithm to reconstruct the topology for an
arbitrary number of taxa. Their algorithm employs the condition that tripods that
share an interior node in the (unknown) tree topology must have the same marginal
distribution at this interior node. Their approach yields an invariant, which for $Q$
amounts to

$$f_1(\pi) = \tau_{\alpha\delta}\tau_{\beta\gamma} - \tau_{\alpha\gamma}\tau_{\beta\delta}. \tag{10}$$

This invariant is related to the four-point condition [e.g., 26, p. 146] and thus
topologically informative, i.e. it is particular to topology $Q$. If a distribution $\pi$ is from
another tree then $f_1(\pi) \neq 0$.
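
The invariant (10) can be illustrated on a simulated quartet distribution. In the sketch below all names and parameter values are hypothetical; the distribution is generated on topology $Q$ by summing over the states of the two interior nodes, and $f_1$ is compared with the analogous expression for a different pairing of the leaves.

```python
from itertools import product


def quartet_distribution(q1, M_psi, M_a, M_b, M_c, M_d):
    """pi[(a, b, c, d)] on the quartet with edges zeta-psi, zeta-alpha,
    zeta-beta, psi-gamma, psi-delta; M[z][x] = Pr[child = x | parent = z]."""
    q = (1.0 - q1, q1)
    pi = {}
    for a, b, c, d in product((0, 1), repeat=4):
        pi[(a, b, c, d)] = sum(
            q[z] * M_a[z][a] * M_b[z][b] * M_psi[z][y] * M_c[y][c] * M_d[y][d]
            for z in (0, 1) for y in (0, 1)
        )
    return pi


def cov(pi, i, j):
    """Pairwise covariance of leaves i and j under pi."""
    ei = sum(pr for x, pr in pi.items() if x[i] == 1)
    ej = sum(pr for x, pr in pi.items() if x[j] == 1)
    eij = sum(pr for x, pr in pi.items() if x[i] == 1 and x[j] == 1)
    return eij - ei * ej


def f1(pi):
    """Invariant (10) with leaves alpha, beta, gamma, delta indexed 0..3."""
    a, b, c, d = range(4)
    return cov(pi, a, d) * cov(pi, b, c) - cov(pi, a, c) * cov(pi, b, d)


# Hypothetical parameters, for illustration only.
pi = quartet_distribution(
    0.3,
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.8, 0.2], [0.3, 0.7]],
    [[0.7, 0.3], [0.1, 0.9]],
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.6, 0.4], [0.2, 0.8]],
)
a, b, c, d = range(4)
other = cov(pi, a, b) * cov(pi, c, d) - cov(pi, a, c) * cov(pi, b, d)
print(f1(pi), other)
```

For these parameters $f_1$ vanishes up to rounding while the expression for the other pairing does not, which is the sense in which the invariant is particular to the topology.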

To reconstruct the process parameters as well, more invariants are needed. In
particular, for $\pi$ to be algebraically identifiable on $Q$ the parameters obtained from the
tripod marginalisations must satisfy the following properties:

1. The parameters for edges $(\zeta, \alpha)$ and $(\zeta, \beta)$ obtained from triplet distributions
$p^{\alpha\beta\gamma}$ and $p^{\alpha\beta\delta}$, respectively, must be equal.

2. The parameters for edges $(\psi, \gamma)$ and $(\psi, \delta)$ obtained from triplet distributions
$p^{\alpha\gamma\delta}$ and $p^{\beta\gamma\delta}$, respectively, must be equal.

3. The parameters $M^\psi$ for the interior edge $(\zeta, \psi)$ are obtained from the equations

$$\begin{aligned}
M^{\zeta\gamma}_{01} &= (1 - M^\psi_{01}) M^\gamma_{01} + M^\psi_{01} M^\gamma_{11}, \\
M^{\zeta\gamma}_{11} &= (1 - M^\psi_{11}) M^\gamma_{01} + M^\psi_{11} M^\gamma_{11},
\end{aligned}$$

where $M^{\zeta\gamma}$ denotes the transition matrix from $\zeta$ to $\gamma$ obtained from tripod $T_{\alpha\beta\gamma}$.
These equations must hold equivalently when $\gamma$ is replaced by $\delta$ and the
parameters come from tripod $T_{\alpha\beta\delta}$ instead of $T_{\alpha\beta\gamma}$.

These conditions imply further restrictions on $\pi$. An indicator for the minimal
number of such conditions is the observation that a quartet distribution $\pi$ has 15
degrees of freedom, but there are only 11 process parameters on $Q$, two for each edge
and one for the root distribution. Thus we need at least four additional conditions or
rather invariants. We will use the above observations to derive an equivalent set of
invariants.

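Proposition 11. A quartet distribution $\pi$ is algebraically identifiable on $Q$ if its tripod
marginalisations satisfy conditions (5) and the following invariants vanish on $\pi$:

$$f_1(\pi) = \tau_{\alpha\delta}\tau_{\beta\gamma} - \tau_{\alpha\gamma}\tau_{\beta\delta},$$
$$f_2(\pi) = \tau_{\alpha\gamma}\tau_{\beta\gamma\delta} - \tau_{\beta\gamma}\tau_{\alpha\gamma\delta},$$

The parameters unique up to state flip at the interior nodes are then given by Theorem 6
and

$$M^\psi_{01} = \frac{1}{2} + \frac{\tau_{\alpha\delta}\tau_{\alpha\beta\gamma} - \tau_{\alpha\beta}\tau_{\alpha\gamma\delta} - \tau_{\alpha\delta}\sqrt{\chi_{\alpha\beta\gamma}}}{2\tau_{\alpha\beta}\sqrt{\chi_{\alpha\gamma\delta}}},$$
$$M^\psi_{11} = \frac{1}{2} + \frac{\tau_{\alpha\delta}\tau_{\alpha\beta\gamma} - \tau_{\alpha\beta}\tau_{\alpha\gamma\delta} + \tau_{\alpha\delta}\sqrt{\chi_{\alpha\beta\gamma}}}{2\tau_{\alpha\beta}\sqrt{\chi_{\alpha\gamma\delta}}}. \qquad (12)$$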
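The expressions in (12) can be checked numerically with a round trip: simulate a quartet distribution from known parameters and recover the interior edge from leaf statistics alone. The sketch below assumes that (12) reads $M^\psi_{01}, M^\psi_{11} = \frac{1}{2} + (\tau_{\alpha\delta}\tau_{\alpha\beta\gamma} - \tau_{\alpha\beta}\tau_{\alpha\gamma\delta} \mp \tau_{\alpha\delta}\sqrt{\chi_{\alpha\beta\gamma}}) / (2\tau_{\alpha\beta}\sqrt{\chi_{\alpha\gamma\delta}})$ with $\chi = \tau^2_{\alpha\beta\gamma} + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}$, and that all covariances are positive. The function names and parameter values are hypothetical.

```python
from itertools import product
from math import isclose, sqrt

def quartet(q1, Ma, Mb, Mpsi, Mc, Md):
    """Leaf distribution on Q; every M is (M01, M11) = P(child = 1 | parent = 0 or 1)."""
    pi = {}
    for s in product((0, 1), repeat=4):
        total = 0.0
        for z, y in product((0, 1), repeat=2):       # hidden states at zeta and psi
            w = (q1 if z else 1 - q1) * (Mpsi[z] if y else 1 - Mpsi[z])
            for M, parent, x in ((Ma, z, s[0]), (Mb, z, s[1]), (Mc, y, s[2]), (Md, y, s[3])):
                w *= M[parent] if x else 1 - M[parent]
            total += w
        pi[s] = total
    return pi

def tau(pi, *leaves):
    """Central mixed moment of the listed leaves (the covariance for two leaves)."""
    mean = [sum(w for s, w in pi.items() if s[i]) for i in range(4)]
    total = 0.0
    for s, w in pi.items():
        for i in leaves:
            w *= s[i] - mean[i]
        total += w
    return total

def chi(pi, i, j, k):
    return tau(pi, i, j, k) ** 2 + 4 * tau(pi, i, j) * tau(pi, i, k) * tau(pi, j, k)

def interior_edge(pi):
    """Assumed reading of (12): returns (M01, M11) of the interior edge (zeta, psi)."""
    a, b, c, d = 0, 1, 2, 3
    common = tau(pi, a, d) * tau(pi, a, b, c) - tau(pi, a, b) * tau(pi, a, c, d)
    root = tau(pi, a, d) * sqrt(chi(pi, a, b, c))
    denom = 2 * tau(pi, a, b) * sqrt(chi(pi, a, c, d))
    return 0.5 + (common - root) / denom, 0.5 + (common + root) / denom

true_psi = (0.25, 0.8)                                # hypothetical interior edge
pi = quartet(0.3, (0.1, 0.9), (0.2, 0.7), true_psi, (0.1, 0.8), (0.3, 0.9))
m01, m11 = interior_edge(pi)
print(isclose(m01, true_psi[0], rel_tol=1e-9), isclose(m11, true_psi[1], rel_tol=1e-9))
```

The recovery only uses pairwise and three-way leaf statistics, which is the sense in which the tripod marginalisations determine the interior edge.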

The existence of these invariants means that tree decomposable quartet
distributions form a Lebesgue zero set in the set of all quartet distributions for the
same reason that the non-identifiable sets are a Lebesgue zero set in the set of all tree
decomposable distributions.

Invariant $f_1$ comes from the equality of the marginal distributions at the interior
nodes, as proposed by Pearl and Tarsi [1]. Invariants $f_2$ and $f_3$ come from the equality
of edge transition matrices. Hence, distributions for which $f_1$, $f_2$ and $f_3$ vanish will
uniquely identify topology $Q$. Therefore, $f_1$ to $f_3$ are topologically informative.

However, only distributions for which $f_0$ vanishes will be subject to the inferred
parameters. In other words, in the set of zero points for $f_1$ to $f_3$ there is a set of
distributions that returns the same set of parameters for $Q$, but only for one of these
distributions $f_0$ vanishes. It would be interesting to investigate how this distribution
relates to the set it projects from, e.g. if it is related to the possible maximum
likelihood optimum.

Despite the fact that $f_1$ to $f_3$ are sufficient to infer a topology, $f_0$ is also
topologically informative in that it will not vanish for distributions coming from
another tree.

In the case of the CFN model, all triplet covariances vanish. Hence, only
invariants $f_0$ and $f_1$ are of interest in that case. Therefore, either invariant is sufficient
to identify the associated tree topology.

The parameters for the interior edge do not add more non-identifiable cases.
However, as in the tripod case, further conditions are needed to guarantee quartet
identifiability.

Proposition 12. A quartet distribution is stochastically identifiable if and only if every
triplet marginalisation satisfies both Theorem 7 and the following inequalities

All other relations are covered due to the fact that the quartet distribution $p$
needs to satisfy the invariants $f_0$ to $f_3$. The following example provides a case
in which reconstruction is not possible but which offers an interesting challenge.

Example 7. Chor et al. [27] discussed several examples of distributions with multiple
maxima of the likelihood function. These examples relate to the CFN model, i.e.,
$p_{abcd} = p_{(1-a)(1-b)(1-c)(1-d)}$ so that the Hadamard approach can be used. Regard the
symmetric distribution

$$p = (14, 0, 0, 3, 0, 2, 1, 0, 0, 1, 2, 0, 3, 0, 0, 14)/40. \qquad (14)$$

Retrieving the statistics yields:

$$\tau_{\alpha\beta} = 7/40 = \tau_{\gamma\delta}, \quad \tau_{\alpha\gamma} = 3/20 = \tau_{\beta\delta}, \quad \tau_{\alpha\delta} = 1/8 = \tau_{\beta\gamma},$$
$$\tau_{\alpha\beta\gamma} = \tau_{\alpha\beta\delta} = \tau_{\alpha\gamma\delta} = \tau_{\beta\gamma\delta} = 0.$$

The last equality immediately shows that the above distribution will trivially satisfy
invariants $f_2$ and $f_3$. However, we get $f_1 = -11/1600$ and $f_0 = -23/375$, i.e. our
observations do not come from the quartet tree defined by the bipartition $\alpha\beta|\gamma\delta$.
Looking at the alternative invariants for $f_1$, i.e. at

$$\tau_{\alpha\beta}\tau_{\gamma\delta} - \tau_{\alpha\gamma}\tau_{\beta\delta} = 13/1600,$$
$$\tau_{\alpha\beta}\tau_{\gamma\delta} - \tau_{\alpha\delta}\tau_{\beta\gamma} = 3/200,$$

we see that this distribution comes from none of the available quartet trees.
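The statistics of this example can be reproduced with exact rational arithmetic. The sketch below assumes that the entries of (14) are ordered lexicographically in $(a, b, c, d)$ and that the three-way statistics are central mixed moments; the helper name `tau` is hypothetical.

```python
from fractions import Fraction as F
from itertools import product

# Distribution (14); entries assumed to be ordered lexicographically in (a, b, c, d).
counts = (14, 0, 0, 3, 0, 2, 1, 0, 0, 1, 2, 0, 3, 0, 0, 14)
p = {s: F(n, 40) for s, n in zip(product((0, 1), repeat=4), counts)}

def tau(p, *leaves):
    """Central mixed moment of the listed leaves (the covariance for two leaves)."""
    mean = [sum(w for s, w in p.items() if s[i]) for i in range(4)]
    total = F(0)
    for s, w in p.items():
        for i in leaves:
            w *= s[i] - mean[i]
        total += w
    return total

A, B, C, D = range(4)   # alpha, beta, gamma, delta

f1 = tau(p, A, D) * tau(p, B, C) - tau(p, A, C) * tau(p, B, D)
alt1 = tau(p, A, B) * tau(p, C, D) - tau(p, A, C) * tau(p, B, D)
alt2 = tau(p, A, B) * tau(p, C, D) - tau(p, A, D) * tau(p, B, C)
triples = [tau(p, *t) for t in ((A, B, C), (A, B, D), (A, C, D), (B, C, D))]

print(tau(p, A, B), tau(p, A, C), tau(p, A, D))
print(f1, alt1, alt2, triples)
```

None of the three expressions vanishes, in line with the statement that the distribution is not compatible with any of the three quartet topologies.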

Nevertheless, we shall have a look at the parameters. Note that the symmetry of
the distribution $p$ implies $M^\alpha_{01} = 1 - M^\alpha_{11} =: M_\alpha$. Looking at the numerical values for
the parameters for every tripod tree we find surprising similarities:

[Table 1 about here.]
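The tripod parameters for this example can be recomputed from the distribution (14). The sketch below assumes the closed form $M^\alpha_{01} = \varepsilon_\alpha + (\tau_{\alpha\beta\gamma} - \sqrt{\chi})/(2\tau_{\beta\gamma})$ with $\chi = \tau^2_{\alpha\beta\gamma} + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}$, as used in the proofs in Appendix A, and a lexicographic ordering of the entries of (14). The names `tau`, `m01` and `table` are hypothetical.

```python
from itertools import product
from math import sqrt

counts = (14, 0, 0, 3, 0, 2, 1, 0, 0, 1, 2, 0, 3, 0, 0, 14)
p = {s: n / 40 for s, n in zip(product((0, 1), repeat=4), counts)}

def tau(p, *leaves):
    """Central mixed moment of the listed leaves (the covariance for two leaves)."""
    mean = [sum(w for s, w in p.items() if s[i]) for i in range(4)]
    total = 0.0
    for s, w in p.items():
        for i in leaves:
            w *= s[i] - mean[i]
        total += w
    return total

def m01(p, leaf, other1, other2):
    """M01 for `leaf` on the tripod spanned by leaf, other1 and other2."""
    t3 = tau(p, leaf, other1, other2)
    chi = t3 ** 2 + 4 * tau(p, leaf, other1) * tau(p, leaf, other2) * tau(p, other1, other2)
    eps = sum(w for s, w in p.items() if s[leaf])
    return eps + (t3 - sqrt(chi)) / (2 * tau(p, other1, other2))

# Leaves 0..3 stand for alpha, beta, gamma, delta; one row per tripod.
table = {}
for triplet in ((0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)):
    table[triplet] = {leaf: m01(p, leaf, *[x for x in triplet if x != leaf])
                      for leaf in triplet}

for triplet, row in table.items():
    print(triplet, {leaf: round(value, 7) for leaf, value in row.items()})
```

Up to rounding, the three values 0.0417424, 0.118119 and 0.172673 recur in every row, only assigned to different leaves, which is the similarity referred to above.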



These parameters permit us to infer parameters $M_\zeta = 1/14$ and $M_\psi = 1/7$ such
that e.g. the parameters for $\alpha$ on the tripod trees $\alpha\beta\delta$ and $\alpha\gamma\delta$ can be obtained from
the parameter for tripod tree $\alpha\beta\gamma$ by

$$M^{\alpha\beta\delta}_\alpha = M_\zeta(1 - M_\alpha) + (1 - M_\zeta)M_\alpha, \quad M^{\alpha\gamma\delta}_\alpha = M_\psi(1 - M_\alpha) + (1 - M_\psi)M_\alpha,$$

with analogue assignments for the other leaves. These computations can be visualised
by the network in Fig. A.4. The assignment of probabilities for each split permits to
justify the observations for each of the four tripod trees. However, the visualisation is
misleading because the factorisation of the system does not follow the edges in the
network [e.g., ?, ?, 28].

[Figure 4 about here.]

6. The connection with Markov invariants

This section investigates the connection between the work presented here and the
concept of Markov invariants as coined by Sumner et al. [13]. To show these relations we
will look back at our covariances and investigate their relationship with the parameters.
Following Allman and Rhodes [6] one can write the three-way-probabilities as a
$2 \times 2 \times 2$ tensor $P^{\alpha\beta\gamma}$ such that

$$P^{\alpha\beta|0} = \begin{pmatrix} p_{000} & p_{010} \\ p_{100} & p_{110} \end{pmatrix}, \quad P^{\alpha\beta|1} = \begin{pmatrix} p_{001} & p_{011} \\ p_{101} & p_{111} \end{pmatrix}.$$



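With this as a basis we easily infer our pairwise covariances in terms of determinants of
dimensional restrictions of $P^{\alpha\beta\gamma}$. E.g., a marginalisation over $\gamma$ corresponds to
$P^{\alpha\beta\Sigma} = P^{\alpha\beta|0} + P^{\alpha\beta|1}$. The determinant of this matrix then corresponds to

$$\begin{aligned}
\det P^{\alpha\beta\Sigma} &= p_{00\Sigma}p_{11\Sigma} - p_{01\Sigma}p_{10\Sigma} \\
&= p_{11\Sigma}(p_{00\Sigma} + p_{11\Sigma} + p_{01\Sigma} + p_{10\Sigma}) - (p_{10\Sigma} + p_{11\Sigma})(p_{01\Sigma} + p_{11\Sigma}) \\
&= p_{11\Sigma} - p_{1\Sigma\Sigma}p_{\Sigma 1\Sigma} = \tau_{\alpha\beta}.
\end{aligned}$$

Thus, we have obtained an alternative way to compute the covariances. In a
similar fashion, if we take the determinant of the conditional kernels $P^{\alpha\beta|c}$, $c \in \{0, 1\}$,
we arrive at the (not normalised) conditional covariance $\tau_{\alpha\beta|c}$:

$$\begin{aligned}
\det P^{\alpha\beta|c} &= p_{00c}p_{11c} - p_{01c}p_{10c} \\
&= p_{11c}(p_{\Sigma\Sigma c} - p_{01c} - p_{10c} - p_{11c}) - (p_{\Sigma 1c} - p_{11c})(p_{1\Sigma c} - p_{11c}) \\
&= p_{11c}p_{\Sigma\Sigma c} - p_{\Sigma 1c}p_{1\Sigma c} = \tau_{\alpha\beta|c}.
\end{aligned}$$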
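Both determinant identities hold for any triplet distribution, not only for tripod decomposable ones, and can be checked exactly. The sketch below uses an arbitrary, hypothetical set of weights; the names `kernel`, `det2` and `tau_ab_given` are introduced here for illustration only.

```python
from fractions import Fraction as F
from itertools import product

weights = (3, 1, 4, 1, 5, 9, 2, 6)           # arbitrary example weights, sum 31
p = {s: F(w, 31) for s, w in zip(product((0, 1), repeat=3), weights)}

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def kernel(c):
    """P^{alpha beta | c}: rows indexed by a, columns by b, third index fixed to c."""
    return [[p[(a, b, c)] for b in (0, 1)] for a in (0, 1)]

# Marginalisation over gamma: entrywise sum of the two conditional kernels.
marginal = [[kernel(0)[a][b] + kernel(1)[a][b] for b in (0, 1)] for a in (0, 1)]

p_a = sum(marginal[1])                        # P(alpha = 1)
p_b = marginal[0][1] + marginal[1][1]         # P(beta = 1)
tau_ab = marginal[1][1] - p_a * p_b           # pairwise covariance

def tau_ab_given(c):
    """Not normalised conditional covariance p_11c p_SSc - p_S1c p_1Sc."""
    k = kernel(c)
    total = sum(map(sum, k))
    return k[1][1] * total - (k[0][1] + k[1][1]) * (k[1][0] + k[1][1])

print(det2(marginal) == tau_ab)
print([det2(kernel(c)) == tau_ab_given(c) for c in (0, 1)])
```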
It must be noted that the determinant has been used earlier in connection with LogDet
families [e.g., 25]. In order to relate these findings to the process parameters, let us
denote by $\Pi = \mathrm{diag}(q^\zeta)$ the diagonal matrix of the marginal distribution at the root,
and with

$$M^\alpha = \begin{pmatrix} M^\alpha_{00} & M^\alpha_{01} \\ M^\alpha_{10} & M^\alpha_{11} \end{pmatrix}$$

the transition matrix for leaf $\alpha$. Then the marginalisation of Equation (4) can be
written as

$$P^{\alpha\beta\Sigma} = (M^\alpha)^T \tilde\Pi M^\beta \qquad (15)$$

where $\tilde\Pi$ is the marginal distribution at the most recent common ancestor of $\alpha$ and $\beta$.
If $E_{\alpha\beta}$ is defined as the set of edges connecting the root of the tree and the most recent
common ancestor of $\alpha$ and $\beta$ then we compute $\tilde\Pi$ by

$$\tilde\Pi = \Pi \prod_{e \in E_{\alpha\beta}} M^e.$$

If we take the determinant on both sides of Eq. (15) we get

$$\det P^{\alpha\beta\Sigma} = \det M^\alpha \det M^\beta \det \Pi \prod_{e \in E_{\alpha\beta}} \det M^e.$$

We further observe that the determinant in the two-state-case is equal to

$$\det M^\alpha = 1 - M^\alpha_{01} - M^\alpha_{10} = -(M^\alpha_{01} - M^\alpha_{11}).$$

Going back to a tripod tree under the two-state-model this yields the relation

$$\tau_{\alpha\beta} = (M^\alpha_{11} - M^\alpha_{01})(M^\beta_{11} - M^\beta_{01})\, q^\zeta_1 (1 - q^\zeta_1). \qquad (16)$$
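For the tripod tree the step from (15) to (16) can be spelled out as follows. This is only a worked restatement, under the assumption that the interior node $\zeta$ is itself the most recent common ancestor of $\alpha$ and $\beta$, so that the edge set $E_{\alpha\beta}$ is empty and $\Pi = \mathrm{diag}(1 - q^\zeta_1, q^\zeta_1)$.

```latex
\begin{align*}
\tau_{\alpha\beta} = \det P^{\alpha\beta\Sigma}
  &= \det (M^\alpha)^T \cdot \det \Pi \cdot \det M^\beta \\
  &= (M^\alpha_{11} - M^\alpha_{01}) \cdot (1 - q^\zeta_1)\, q^\zeta_1 \cdot (M^\beta_{11} - M^\beta_{01}).
\end{align*}
```

The first factor uses $\det (M^\alpha)^T = \det M^\alpha$ together with the two-state determinant given above.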

This relation has been observed in Steel [25] and forms the basis for LogDet inference.
The covariances $\tau_{\alpha\beta}$ also form the simplest form of Markov invariants. Sumner et al.
[13] define these terms in general by:

$$f(p) = g(\tilde p) \prod_{e \in E} (\det M^e)^{k_e}, \qquad (17)$$

with $k_e \in \mathbb{Z}$ denoting the exponent for edge $e \in E$. The term $g(\tilde p)$ describes a function
depicting the relationship of a reduced structure in the tree. Sumner et al. [13] give one
example of such a reduced structure as the tree for which the pendant edges have been
reduced to length zero. In the case of the tripod tree this reduced structure corresponds
to the interior node and hence the distribution $\tilde p$ is equivalent to $q^\zeta$ only. In this
setting, Markov invariants are one-dimensional "representations" of the stochastic
models used for inference, such that the complex structure of these models is retained.

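In our framework, we rediscover more Markov invariants of type (17) when
investigating how the remaining covariances are related to the process parameters under
the tripod equations (4). In fact, we find:

$$\tau_{\alpha\beta\gamma} = (M^\alpha_{11} - M^\alpha_{01})(M^\beta_{11} - M^\beta_{01})(M^\gamma_{11} - M^\gamma_{01})\, q^\zeta_1(1 - q^\zeta_1)(1 - 2q^\zeta_1), \qquad (18)$$
$$\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} = (M^\alpha_{11} - M^\alpha_{01})^2(M^\beta_{11} - M^\beta_{01})^2(M^\gamma_{11} - M^\gamma_{01})^2 (q^\zeta_1)^3(1 - q^\zeta_1)^3, \qquad (19)$$
$$\chi = (M^\alpha_{11} - M^\alpha_{01})^2(M^\beta_{11} - M^\beta_{01})^2(M^\gamma_{11} - M^\gamma_{01})^2 (q^\zeta_1)^2(1 - q^\zeta_1)^2, \qquad (20)$$

with equivalent terms for the other covariances. These equivalences permit a different
way to prove Theorem 6 from the one we present in Appendix A.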
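Relations (16) and (18) to (20) can be verified exactly for any choice of rational parameters. The sketch below builds a triplet distribution from the tripod equations with hypothetical parameter values, takes $\tau_{\alpha\beta\gamma}$ to be the central mixed moment of the three leaves, and uses $\chi = \tau^2_{\alpha\beta\gamma} + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}$; the function names are introduced here for illustration.

```python
from fractions import Fraction as F
from itertools import product

def tripod(q1, M):
    """Tripod equations; M[i] = (M01, M11) for leaf i, q1 = P(root = 1)."""
    p = {}
    for s in product((0, 1), repeat=3):
        total = F(0)
        for z in (0, 1):                      # hidden state at the interior node
            w = q1 if z else 1 - q1
            for m, x in zip(M, s):
                w *= m[z] if x else 1 - m[z]
            total += w
        p[s] = total
    return p

def tau(p, *leaves):
    """Central mixed moment of the listed leaves (the covariance for two leaves)."""
    mean = [sum(w for s, w in p.items() if s[i]) for i in range(3)]
    total = F(0)
    for s, w in p.items():
        for i in leaves:
            w *= s[i] - mean[i]
        total += w
    return total

q1 = F(3, 10)                                 # hypothetical parameter values
M = [(F(1, 10), F(9, 10)), (F(1, 5), F(7, 10)), (F(3, 10), F(4, 5))]
p = tripod(q1, M)

d = [m11 - m01 for m01, m11 in M]             # determinants of the transition matrices
v = q1 * (1 - q1)
t_ab, t_ac, t_bc = tau(p, 0, 1), tau(p, 0, 2), tau(p, 1, 2)
t_abc = tau(p, 0, 1, 2)
chi = t_abc ** 2 + 4 * t_ab * t_ac * t_bc

print(t_ab == d[0] * d[1] * v)                                   # (16)
print(t_abc == d[0] * d[1] * d[2] * v * (1 - 2 * q1))            # (18)
print(t_ab * t_ac * t_bc == (d[0] * d[1] * d[2]) ** 2 * v ** 3)  # (19)
print(chi == (d[0] * d[1] * d[2] * v) ** 2)                      # (20)
```

Relation (20) follows from (18) and (19) because $(1 - 2q)^2 + 4q(1 - q) = 1$.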

It should be noted that our interpretation of the above Markov invariants as
covariances only works for the two state model. On the other hand, the form of the
Markov invariants stays valid, even though they might not be as immediately apparent
from the model as in the cases discussed here. However, in the case of the two-state
model using the notion of covariance permits a good interpretation of the findings.

We observe for the (not normalised) conditional covariances

$$\tau_{\alpha\beta|c} = (M^\alpha_{11} - M^\alpha_{01})(M^\beta_{11} - M^\beta_{01})\, M^\gamma_{0c} M^\gamma_{1c}\, q^\zeta_1(1 - q^\zeta_1), \qquad (21)$$

i.e., the transition matrix for leaf $\gamma$ shall be included into the term $g(\tilde p)$ for (17) to be
valid. On the other hand, remember that we did not use these covariances to solve the
tripod equations (4). We need them only to formulate the positivity constraints in
Theorem 7. This property is beyond the purely algebraic framework.

In summary, Markov invariants are very useful when investigating properties of
and conditions on leaf distributions $p$. Especially, they explore the relationship of
process parameters and leaf distribution such that phylogenetic invariants like $f_1$ to $f_3$
from Proposition 11 can be easily extracted. We will employ these relationships to
prove the results of Section 3.

7. Discussion

Appendix A. Proofs

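Proof of Lemma 1. A state flip at leaf $\alpha$ implies a "new" distribution $\tilde p$ with
$\tilde p_{abc} = p_{(1-a)bc}$, $a, b, c \in \{0, 1\}$. This has the following implications for
the covariances:

$$\tau_{\alpha\beta} = -p_{01\Sigma} + p_{\Sigma 1\Sigma}(1 - p_{1\Sigma\Sigma}) = -(p_{01\Sigma} - p_{0\Sigma\Sigma}p_{\Sigma 1\Sigma}) = -\tilde\tau_{\alpha\beta}$$

and analogously $\tilde\tau_{\alpha\gamma} = -\tau_{\alpha\gamma}$ and $\tilde\tau_{\beta\gamma} = \tau_{\beta\gamma}$. Thus, if $\tau_{\alpha\beta}$ and $\tau_{\alpha\gamma}$ are smaller than zero,
then a state flip produces positive covariances and the sign for the overall product
remains the same. □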
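The effect of a state flip on the pairwise covariances can be illustrated on any triplet distribution. The sketch below uses arbitrary, hypothetical weights and the helper name `cov`; it flips the state at the first leaf and compares the covariances before and after.

```python
from fractions import Fraction as F
from itertools import product

weights = (3, 1, 4, 1, 5, 9, 2, 6)            # arbitrary example weights, sum 31
p = {s: F(w, 31) for s, w in zip(product((0, 1), repeat=3), weights)}

# State flip at leaf alpha: the new distribution assigns p_{(1-a)bc} to (a, b, c).
flipped = {(1 - a, b, c): w for (a, b, c), w in p.items()}

def cov(p, i, j):
    """tau_ij = P(X_i = 1, X_j = 1) - P(X_i = 1) P(X_j = 1)."""
    p_ij = sum(w for s, w in p.items() if s[i] == 1 and s[j] == 1)
    p_i = sum(w for s, w in p.items() if s[i] == 1)
    p_j = sum(w for s, w in p.items() if s[j] == 1)
    return p_ij - p_i * p_j

before = [cov(p, 0, 1), cov(p, 0, 2), cov(p, 1, 2)]
after = [cov(flipped, 0, 1), cov(flipped, 0, 2), cov(flipped, 1, 2)]
print(before)
print(after)
```

The two covariances involving the flipped leaf change sign and the third is unchanged, so the product of the three covariances keeps its sign.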

Proof of Lemma 2. Using the Markov invariants from Section 6 we immediately see
that if $\tau_{\alpha\beta} = 0$ due to $M^\alpha_{01} - M^\alpha_{11} = 0$ then also $\tau_{\alpha\gamma} = 0$ and $\tau_{\alpha\beta\gamma} = 0$. If $q^\zeta_1 \in \{0, 1\}$ then
all four covariances are zero.

For point 2 regard (19). But this term will be non-negative as long as $q^\zeta_1$ is a
probability, which is a model condition. This completes the proof. □

Proof of Corollary. Select one leaf $\alpha \in L$ and define $L_0 = \{\beta : \tau_{\alpha\beta} < 0\}$. Flipping the
states in $L_0$ gives us $\tau_{\alpha\beta} > 0$ for all $\beta \in L$, $\beta \neq \alpha$ by Lemma 1. Fix now
$\beta, \beta' \in L \setminus \{\alpha\}$. Then $\alpha, \beta, \beta'$, together with the root $\zeta$ of the tree, define uniquely a
tripod tree and the restriction of $p$ to $\alpha, \beta, \beta'$ must obey the tripod equations. Using
Lemma 2(2) on this tripod tree shows now that $\tau_{\alpha\beta}\tau_{\alpha\beta'}\tau_{\beta\beta'} > 0$. This implies that $\tau_{\beta\beta'}$
is positive, too. □

Proof of Corollary 7. A triplet distribution $p$ for which only one covariance is zero does
not satisfy Lemma 2(1) and hence is not tripod decomposable. Further, by looking at
(16) we see that there is also no real- or complex-valued parameter set that would yield
only one zero covariance. Hence, such a triplet distribution would also not be
algebraically decomposable. □
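Proof of Lemma 5. We insert the refined parameters into the tripod equations to get:

$$\begin{aligned}
p_{abc} &= q_1 M^\alpha_{1a}M^\beta_{1b}M^\gamma_{1c} + (1 - q_1)M^\alpha_{0a}M^\beta_{0b}M^\gamma_{0c} \\
&= (1 - \tilde q_1)\tilde M^\alpha_{0a}\tilde M^\beta_{0b}\tilde M^\gamma_{0c} + \tilde q_1 \tilde M^\alpha_{1a}\tilde M^\beta_{1b}\tilde M^\gamma_{1c},
\end{aligned}$$

i.e. the tripod equations are recovered with flipped parameters. This completes the
proof. □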

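Proof of Theorem 6. We derive the parameters from the tripod equations. As
mentioned in Section 3.1 there is a linear relationship between $p$ and its
marginalisations. Thus, finding a solution for the tripod equations is equivalent to
finding the solution for the following set of equations

$$\varepsilon_{\alpha\beta\gamma} = q_1 M^\alpha_{11}M^\beta_{11}M^\gamma_{11} + (1 - q_1)M^\alpha_{01}M^\beta_{01}M^\gamma_{01}, \qquad (A.1)$$
$$\varepsilon_{\alpha\beta} = q_1 M^\alpha_{11}M^\beta_{11} + (1 - q_1)M^\alpha_{01}M^\beta_{01}, \qquad (A.2)$$
$$\varepsilon_{\alpha\gamma} = q_1 M^\alpha_{11}M^\gamma_{11} + (1 - q_1)M^\alpha_{01}M^\gamma_{01}, \qquad (A.3)$$
$$\varepsilon_{\beta\gamma} = q_1 M^\beta_{11}M^\gamma_{11} + (1 - q_1)M^\beta_{01}M^\gamma_{01}, \qquad (A.4)$$
$$\varepsilon_\alpha = q_1 M^\alpha_{11} + (1 - q_1)M^\alpha_{01}, \qquad (A.5)$$
$$\varepsilon_\beta = q_1 M^\beta_{11} + (1 - q_1)M^\beta_{01}, \qquad (A.6)$$
$$\varepsilon_\gamma = q_1 M^\gamma_{11} + (1 - q_1)M^\gamma_{01}. \qquad (A.7)$$

Equations (A.5)-(A.7) yield

$$(1 - q_1)M^\alpha_{01} = \varepsilon_\alpha - q_1 M^\alpha_{11}, \quad (1 - q_1)M^\beta_{01} = \varepsilon_\beta - q_1 M^\beta_{11}, \quad (1 - q_1)M^\gamma_{01} = \varepsilon_\gamma - q_1 M^\gamma_{11}. \qquad (A.8)$$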



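Inserting (A.8) into (A.2) returns

$$(1 - q_1)\varepsilon_{\alpha\beta} = q_1(1 - q_1)M^\alpha_{11}M^\beta_{11} + (\varepsilon_\alpha - q_1 M^\alpha_{11})(\varepsilon_\beta - q_1 M^\beta_{11})$$

and in consequence

$$q_1 M^\beta_{11}(M^\alpha_{11} - \varepsilon_\alpha) = \tau_{\alpha\beta} + q_1(\varepsilon_\beta M^\alpha_{11} - \varepsilon_{\alpha\beta}), \qquad (A.9)$$
$$q_1 M^\gamma_{11}(M^\alpha_{11} - \varepsilon_\alpha) = \tau_{\alpha\gamma} + q_1(\varepsilon_\gamma M^\alpha_{11} - \varepsilon_{\alpha\gamma}). \qquad (A.10)$$

We insert (A.9)-(A.10) back into (A.8)

$$(1 - q_1)M^\beta_{01}(M^\alpha_{11} - \varepsilon_\alpha) = (1 - q_1)(\varepsilon_\beta M^\alpha_{11} - \varepsilon_{\alpha\beta}).$$

In the case of $q_1 = 1$ we get from (A.5) and (A.2) that $M^\alpha_{11} = \varepsilon_\alpha$ and $\varepsilon_{\alpha\beta} = \varepsilon_\alpha\varepsilon_\beta$. Hence,
we remove $1 - q_1$ from the above equation without destroying equality. Thus, we get

$$M^\beta_{01}(M^\alpha_{11} - \varepsilon_\alpha) = \varepsilon_\beta M^\alpha_{11} - \varepsilon_{\alpha\beta}, \qquad (A.11)$$
$$M^\gamma_{01}(M^\alpha_{11} - \varepsilon_\alpha) = \varepsilon_\gamma M^\alpha_{11} - \varepsilon_{\alpha\gamma}. \qquad (A.12)$$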

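We insert (A.8) in (A.1) and apply (A.11) and (A.12), which gives a quadratic equation
in $M^\alpha_{11}$. We can apply the solution formula for quadratic equations provided
$\tau_{\beta\gamma} \neq 0$, i.e. our condition (5) is satisfied. In that case we get

$$M^\alpha_{11} = \varepsilon_\alpha + \frac{\tau_{\alpha\beta\gamma}}{2\tau_{\beta\gamma}} + \frac{\sqrt{\tau^2_{\alpha\beta\gamma} + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}}}{2\tau_{\beta\gamma}} = \varepsilon_\alpha + \frac{\tau_{\alpha\beta\gamma} + \sqrt{\chi}}{2\tau_{\beta\gamma}}. \qquad (A.13)$$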

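Thus we have established the term for $M^\alpha_{11}$. The next step is to derive $q_1$. We insert
(A.9)-(A.12) into (A.1) and get

$$q_1\varepsilon_{\beta\gamma}(M^\alpha_{11} - \varepsilon_\alpha)^2 = (1 - q_1)\tau_{\alpha\beta}\tau_{\alpha\gamma} + q_1\varepsilon_\beta\varepsilon_\gamma(M^\alpha_{11} - \varepsilon_\alpha)^2$$

and hence we get the quadratic relation

$$0 = (1 - q_1)\tau_{\alpha\beta}\tau_{\alpha\gamma} - q_1\tau_{\beta\gamma}(M^\alpha_{11} - \varepsilon_\alpha)^2. \qquad (A.14)$$

We insert (A.13) and get

$$4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} = q_1\left[4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} + (\tau_{\alpha\beta\gamma} + \sqrt{\chi})^2\right],$$
$$4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} = 2q_1\sqrt{\chi}(\sqrt{\chi} + \tau_{\alpha\beta\gamma}).$$

We use the equality

$$4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} = \chi - \tau^2_{\alpha\beta\gamma} = (\sqrt{\chi} + \tau_{\alpha\beta\gamma})(\sqrt{\chi} - \tau_{\alpha\beta\gamma})$$

and the observation that $\sqrt{\chi}(\sqrt{\chi} + \tau_{\alpha\beta\gamma}) = 0$ if and only if the conditions in (5) are
violated to get

$$q_1 = \frac{\sqrt{\chi} - \tau_{\alpha\beta\gamma}}{2\sqrt{\chi}} = \frac{1}{2} - \frac{\tau_{\alpha\beta\gamma}}{2\sqrt{\chi}}, \qquad (A.15)$$

thus inferring the proposed term for $q_1$. Next we infer the term for $M^\alpha_{01}$. To this end we
insert (A.13) and (A.15) into (A.8):

$$-q_1(M^\alpha_{11} - \varepsilon_\alpha) = (1 - q_1)(M^\alpha_{01} - \varepsilon_\alpha),$$
$$M^\alpha_{01} = \varepsilon_\alpha + \frac{\tau_{\alpha\beta\gamma} - \sqrt{\chi}}{2\tau_{\beta\gamma}},$$

thus inferring the proposed term. The remaining terms are inferred analogously. This
completes the proof. □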
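The closed forms derived in this proof can be checked with a numerical round trip: generate a triplet distribution from the tripod equations and recover the parameters from its statistics. The sketch below assumes the expressions $M^\alpha_{11}, M^\alpha_{01} = \varepsilon_\alpha + (\tau_{\alpha\beta\gamma} \pm \sqrt{\chi})/(2\tau_{\beta\gamma})$ and $q_1 = \frac{1}{2} - \tau_{\alpha\beta\gamma}/(2\sqrt{\chi})$, positive pairwise covariances, and hypothetical parameter values; `tripod`, `tau` and `recover` are names introduced for illustration.

```python
from itertools import product
from math import isclose, sqrt

def tripod(q1, M):
    """Tripod equations; M[i] = (M01, M11) for leaf i, q1 = P(root = 1)."""
    p = {}
    for s in product((0, 1), repeat=3):
        total = 0.0
        for z in (0, 1):
            w = q1 if z else 1 - q1
            for m, x in zip(M, s):
                w *= m[z] if x else 1 - m[z]
            total += w
        p[s] = total
    return p

def tau(p, *leaves):
    """Central mixed moment of the listed leaves (the covariance for two leaves)."""
    mean = [sum(w for s, w in p.items() if s[i]) for i in range(3)]
    total = 0.0
    for s, w in p.items():
        for i in leaves:
            w *= s[i] - mean[i]
        total += w
    return total

def recover(p):
    """Recover (q1, [(M01, M11), ...]) from the statistics of a triplet distribution."""
    t = {pair: tau(p, *pair) for pair in ((0, 1), (0, 2), (1, 2))}
    t3 = tau(p, 0, 1, 2)
    root = sqrt(t3 ** 2 + 4 * t[(0, 1)] * t[(0, 2)] * t[(1, 2)])   # sqrt(chi)
    q1 = 0.5 - t3 / (2 * root)
    M = []
    for i in range(3):
        j, k = [x for x in range(3) if x != i]
        eps = sum(w for s, w in p.items() if s[i])
        M.append((eps + (t3 - root) / (2 * t[(j, k)]),
                  eps + (t3 + root) / (2 * t[(j, k)])))
    return q1, M

true_q1 = 0.3                                   # hypothetical parameter values
true_M = [(0.1, 0.9), (0.2, 0.7), (0.3, 0.8)]
q1_hat, M_hat = recover(tripod(true_q1, true_M))
print(q1_hat, M_hat)
```

Because every $M_{11} - M_{01}$ is positive in this example, the recovered values coincide with the generating ones; with negative differences the recovery would return a state-flipped parameter set instead.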

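Proof of Theorem 7. We bound the parameters from (6) between 0 and 1:

$$0 < \frac{1}{2} - \frac{\tau_{\alpha\beta\gamma}}{2\sqrt{\chi}} < 1.$$

With (5) this yields positivity for the unconditional covariances. Next we look at $M^\alpha_{01}$
and $M^\alpha_{11}$:

$$0 < \varepsilon_\alpha + \frac{\tau_{\alpha\beta\gamma} - \sqrt{\chi}}{2\tau_{\beta\gamma}} < 1,$$
$$\tau_{\alpha\beta\gamma} - 2(1 - \varepsilon_\alpha)\tau_{\beta\gamma} < \sqrt{\chi} < \tau_{\alpha\beta\gamma} + 2\varepsilon_\alpha\tau_{\beta\gamma}$$

and

$$0 < \varepsilon_\alpha + \frac{\tau_{\alpha\beta\gamma} + \sqrt{\chi}}{2\tau_{\beta\gamma}} < 1,$$
$$-2\varepsilon_\alpha\tau_{\beta\gamma} < \tau_{\alpha\beta\gamma} + \sqrt{\chi} < 2(1 - \varepsilon_\alpha)\tau_{\beta\gamma},$$
$$-(2\varepsilon_\alpha\tau_{\beta\gamma} + \tau_{\alpha\beta\gamma}) < \sqrt{\chi} < 2(1 - \varepsilon_\alpha)\tau_{\beta\gamma} - \tau_{\alpha\beta\gamma}.$$

Squaring both inequalities reduces the four inequalities to the following two:

$$\tau^2_{\alpha\beta\gamma} + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} < (\tau_{\alpha\beta\gamma} + 2\varepsilon_\alpha\tau_{\beta\gamma})^2, \qquad (A.16)$$
$$\tau^2_{\alpha\beta\gamma} + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} < (2(1 - \varepsilon_\alpha)\tau_{\beta\gamma} - \tau_{\alpha\beta\gamma})^2. \qquad (A.17)$$

We look first at inequality (A.16) and get

$$0 < \varepsilon_\alpha\varepsilon_{\alpha\beta\gamma} - \varepsilon_{\alpha\beta}\varepsilon_{\alpha\gamma} = \tau_{\beta\gamma|1}.$$

Set $(1 - \varepsilon_\alpha) = p_{0\Sigma\Sigma}$ and look at (A.17):

$$0 < p_{0\Sigma\Sigma}(p_{0\Sigma\Sigma}\tau_{\beta\gamma} - \tau_{\alpha\beta\gamma}) - \tau_{\alpha\beta}\tau_{\alpha\gamma} = p_{000}p_{011} - p_{001}p_{010} = \tau_{\beta\gamma|0}.$$

Hence, we have derived the proposed inequalities. □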

Proof of Proposition. The tripod equations (4) determine $\chi$ in terms of the process
parameters. Together with (18) and (16) we see that there is no set of real or complex
parameters such that $\chi = 0$ but $\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} \neq 0$. □

Proof of Proposition. The cases are easily verified by looking at Equation (16) and
inserting the selected parameters back into (4). □

Proof of Proposition 7. The function $\chi$ is a nonconstant polynomial mapping into
$\mathbb{C}$. Thus the set $\{p : \chi(p) = 0\}$ is a Lebesgue zero set. The same holds for
the set

$$\{p : \tau_{\alpha\beta}(p) = 0 \text{ or } \tau_{\alpha\gamma}(p) = 0 \text{ or } \tau_{\beta\gamma}(p) = 0\}.$$

This completes the proof. □

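Proof of Proposition 11. We recover $M^\psi$ by inserting the parameters from (6) into
(11). To infer the invariants we first look at the equality conditions. We do this
representatively by looking at $M^\alpha$, writing $M^\alpha$ for the parameters obtained from tripod
$T_{\alpha\beta\gamma}$ and $\tilde M^\alpha$ for those obtained from tripod $T_{\alpha\beta\delta}$. In particular we look at

$$M^\alpha_{11} - M^\alpha_{01} = \tilde M^\alpha_{11} - \tilde M^\alpha_{01}, \qquad M^\alpha_{11} + M^\alpha_{01} = \tilde M^\alpha_{11} + \tilde M^\alpha_{01},$$

and thus

$$\frac{\tau^2_{\alpha\beta\gamma} + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}}{\tau^2_{\beta\gamma}} = \frac{\tau^2_{\alpha\beta\delta} + 4\tau_{\alpha\beta}\tau_{\alpha\delta}\tau_{\beta\delta}}{\tau^2_{\beta\delta}}, \qquad \frac{\tau_{\alpha\beta\gamma}}{\tau_{\beta\gamma}} = \frac{\tau_{\alpha\beta\delta}}{\tau_{\beta\delta}}, \qquad \frac{\tau_{\alpha\gamma}}{\tau_{\beta\gamma}} = \frac{\tau_{\alpha\delta}}{\tau_{\beta\delta}}.$$

Looking at $M^\beta = \tilde M^\beta$ yields the same equalities. Reproducing the calculations for
$M^\gamma = \tilde M^\gamma$ yields the invariants $f_1$ to $f_3$.

For the inference of $f_0$ observe that the equation system (2) can be written in a
marginalised form, i.e. one replaces the equations in $(p_{abcd})_{a,b,c,d \in \{0,1\}}$ by the linear
transforms $\varepsilon_{\alpha\beta\gamma\delta}$, $\varepsilon_{\alpha\beta\gamma}$, $\varepsilon_{\alpha\beta\delta}$, $\varepsilon_{\alpha\gamma\delta}$, $\varepsilon_{\beta\gamma\delta}$, $\varepsilon_{\alpha\beta}$, $\varepsilon_{\alpha\gamma}$, $\varepsilon_{\alpha\delta}$, $\varepsilon_{\beta\gamma}$, $\varepsilon_{\beta\delta}$, $\varepsilon_{\gamma\delta}$, $\varepsilon_\alpha$, $\varepsilon_\beta$, $\varepsilon_\gamma$ and $\varepsilon_\delta$.

We immediately see that all terms but $\varepsilon_{\alpha\beta\gamma\delta}$ are covered by our investigation of
the tripod case. We insert the parameters obtained in (6) and (12) into the equation for
$\varepsilon_{\alpha\beta\gamma\delta}$ to get:

$$\begin{aligned}
\varepsilon_{\alpha\beta\gamma\delta} = {} & (1 - q_1)M^\alpha_{01}M^\beta_{01}\big((1 - M^\psi_{01})M^\gamma_{01}M^\delta_{01} + M^\psi_{01}M^\gamma_{11}M^\delta_{11}\big) \\
& + q_1 M^\alpha_{11}M^\beta_{11}\big((1 - M^\psi_{11})M^\gamma_{01}M^\delta_{01} + M^\psi_{11}M^\gamma_{11}M^\delta_{11}\big).
\end{aligned}$$

Reordering and restructuring this equation eventually yields invariant $f_0$. This
completes the proof. □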



Proof of Proposition 12. Theorem 7 covers the first part of the Proposition. The
remaining inequalities are obtained by bounding (12) between 0 and 1 and using the fact
that the covariances are always positive with Lemma 2(1). □
List of Figures

A.1 A binary tree with six leaves. Gray lines and nodes describe the hidden
part of the process.
A.2 The tripod tree T.
A.3 The quartet tree Q with its tripod restrictions $T_{\alpha\beta\gamma}$, $T_{\alpha\beta\delta}$, $T_{\alpha\gamma\delta}$ and $T_{\beta\gamma\delta}$. Again,
gray lines and vertices indicate the hidden or unknown variables of the
approach presented here.
A.4 Assignment of mutation probability from the symmetric distribution in
(14). The black lines indicate the triplet $\alpha\beta\gamma$. Assigned branch lengths
are rounded values.

Figure A.2: The tripod tree T.

Figure A.3: The quartet tree Q with its tripod restrictions $T_{\alpha\beta\gamma}$, $T_{\alpha\beta\delta}$, $T_{\alpha\gamma\delta}$ and $T_{\beta\gamma\delta}$. Again, gray lines and vertices
indicate the hidden or unknown variables of the approach presented here.

Figure A.4: Assignment of mutation probability from the symmetric distribution in (14). The black
lines indicate the triplet $\alpha\beta\gamma$. Assigned branch lengths are rounded values.

List of Tables

A.1 The parameters for each triplet.
triplet                  $M_\alpha$     $M_\beta$      $M_\gamma$     $M_\delta$     $q^\zeta_1$
$\alpha\beta\gamma$      0.0417424      0.118119       0.172673       -              0.5
$\alpha\beta\delta$      0.118119       0.0417424      -              0.172673       0.5
$\alpha\gamma\delta$     0.172673       -              0.0417424      0.118119       0.5
$\beta\gamma\delta$      -              0.172673       0.118119       0.0417424      0.5

Table A.1: The parameters for each triplet.